The impact of amplification on differential expression analyses by RNA-seq
Authors
Correspondence: [email protected] Anthropology and Human Genomics, Department of Biology II, Ludwig Maximilians University Munich, Grosshaderner Str. 2, D-82152 Martinsried, Germany. Full list of author information is available at the end of the article.
Abstract
Background: Currently, quantitative RNA-seq methods are pushed to work with increasingly small starting amounts of RNA that require PCR amplification to generate libraries. However, it is unclear how much noise or bias amplification introduces and how this affects the precision and accuracy of RNA quantification. To assess the effects of amplification, reads that originated from the same RNA molecule (PCR duplicates) need to be identified. Computationally, read duplicates are defined via their mapping position, which does not distinguish PCR duplicates from natural duplicates that are bound to occur for highly transcribed RNAs. Hence, it is unclear how to treat duplicate reads and how important it is to reduce PCR amplification experimentally.
Results: Here, we generate and analyse RNA-seq datasets that were prepared with three different protocols (Smart-Seq, TruSeq and UMI-seq). We find that a large fraction of computationally identified read duplicates can be explained by sampling and fragmentation bias. Consequently, the computational removal of duplicates does not improve accuracy, power or false discovery rates, but can actually worsen them. Even when duplicates are experimentally identified by unique molecular identifiers (UMIs), power and false discovery rate are only mildly improved. However, we do find that power improves with fewer PCR amplification cycles across datasets, and that early barcoding of samples, and hence PCR amplification in one reaction, can restore this loss of power.
Conclusions: Computational removal of read duplicates is not recommended for differential expression analysis. However, the pooling of samples, as made possible by the early barcoding of the UMI protocol, leads to an appreciable increase in the power to detect differentially expressed genes.
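The difference between the two duplicate definitions in the abstract can be illustrated with a small sketch. It is not the authors' pipeline: the `Read` record, the function names and the toy reads are hypothetical, and the sketch assumes single-end reads with error-free UMIs. It only shows why position-based removal can discard natural duplicates that UMI-based removal keeps.

```python
from collections import namedtuple

# Hypothetical minimal read record (assumption for illustration only);
# a real pipeline would parse these fields from alignment files.
Read = namedtuple("Read", ["gene", "chrom", "pos", "strand", "umi"])


def dedup_by_position(reads):
    """Keep one read per mapping position (computational duplicate definition)."""
    seen = set()
    kept = []
    for r in reads:
        key = (r.chrom, r.pos, r.strand)
        if key not in seen:
            seen.add(key)
            kept.append(r)
    return kept


def dedup_by_umi(reads):
    """Keep one read per mapping position and UMI (experimental definition)."""
    seen = set()
    kept = []
    for r in reads:
        key = (r.chrom, r.pos, r.strand, r.umi)
        if key not in seen:
            seen.add(key)
            kept.append(r)
    return kept


def count_per_gene(reads):
    """Count reads per gene."""
    counts = {}
    for r in reads:
        counts[r.gene] = counts.get(r.gene, 0) + 1
    return counts


# Toy data: three distinct RNA molecules of one gene, one of them amplified twice.
reads = [
    Read("geneA", "chr1", 100, "+", "AACG"),  # molecule 1
    Read("geneA", "chr1", 100, "+", "AACG"),  # PCR copy of molecule 1
    Read("geneA", "chr1", 100, "+", "TTGC"),  # molecule 2, a natural duplicate
    Read("geneA", "chr1", 250, "+", "GGAT"),  # molecule 3
]

raw_counts = count_per_gene(reads)
position_counts = count_per_gene(dedup_by_position(reads))
umi_counts = count_per_gene(dedup_by_umi(reads))

print(raw_counts, position_counts, umi_counts)
```

In this toy case the raw count is 4, position-based removal leaves 2 reads and so undercounts the three molecules, while UMI-based removal recovers 3. Real data would also need handling of UMI sequencing errors and paired-end fragments, which this sketch ignores.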